Kaluza Analysis Tree Plots Offer a Fast and Comprehensive Overview of Multicolor Flow Cytometry Data
This white paper introduces the Tree Plot as a supervised approach for comprehensive deconvolution of flow cytometry data, bridging the gap between fully manual analysis and unsupervised computational analysis approaches. Step-by-step instructions for the generation of Tree Plots using Kaluza Analysis Flow Cytometry software are provided.
Manual Flow Cytometry Data Analysis
Flow cytometry data is traditionally analyzed by sequential gating. The expression of markers is visualized one at a time or two at a time, and populations are identified subjectively by positioning so-called gates on the displayed distributions.
All events contained within this parent gate are then further assessed for other markers and more child gates are defined, building a gating hierarchy that sequentially excludes more and more events from the analysis until only the population of interest remains (see Figure 1).4
Boolean gating allows the user to combine previously defined gates using logical operators such as AND, NOT and OR to determine whether the events in the gate, or a combination of gates, are to be included or excluded.
Figure 1. Sequential Gating of Selected B-cell Subsets. Human peripheral blood was stained using the DURAClone IM B Cell Antibody Panel (Part Number B53328) according to IFU and acquired on a CytoFLEX LX flow cytometer. Data was analyzed in Kaluza Analysis. Cell populations are identified by sequential gating and Boolean gating.
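Sequential and Boolean gating can be sketched with plain boolean logic. This is a minimal illustration, not Kaluza's implementation: the event records, intensity values, the single positivity cut-off and the helper name in_gate are all hypothetical, and real gates are drawn on plots rather than defined by one threshold.

```python
# Hypothetical events: each dict holds illustrative intensities per marker.
events = [
    {"CD19": 900, "IgM": 700, "IgD": 650},
    {"CD19": 850, "IgM": 40, "IgD": 30},
    {"CD19": 20, "IgM": 10, "IgD": 15},
    {"CD19": 950, "IgM": 800, "IgD": 25},
]

THRESHOLD = 100  # assumed positivity cut-off, identical for all markers


def in_gate(event, marker):
    """Return True if the event is positive for the marker (inside the gate)."""
    return event[marker] > THRESHOLD


# Sequential gating: the child gate only sees events that passed the parent gate.
cd19_pos = [e for e in events if in_gate(e, "CD19")]
igm_pos_of_cd19 = [e for e in cd19_pos if in_gate(e, "IgM")]

# Boolean gating: combine gate memberships with AND, NOT and OR.
double_pos = [e for e in cd19_pos if in_gate(e, "IgM") and in_gate(e, "IgD")]
double_neg = [e for e in cd19_pos if not in_gate(e, "IgM") and not in_gate(e, "IgD")]
either_pos = [e for e in cd19_pos if in_gate(e, "IgM") or in_gate(e, "IgD")]

print(len(cd19_pos), len(igm_pos_of_cd19), len(double_pos), len(double_neg), len(either_pos))
```

Each step of the sequential hierarchy discards events, which is why populations outside the chosen gates never reach the later plots.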
Both approaches rely on user-defined gates and are highly subjective. Classical gating strategies have been shown to lack reproducibility and have high inter-operator variability.5 In addition, they require a lot of hands-on time: a comprehensive review of all pairwise marker combinations in a 10-color panel requires evaluating 45 bivariate plots.7
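The figure of 45 follows from counting unordered marker pairs, assuming each pair of the n markers is reviewed in exactly one two-parameter plot:

```latex
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad \binom{10}{2} = \frac{10 \cdot 9}{2} = 45
```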
Excluding events can introduce operator-specific bias. Data analysis in this case is solely driven by prior knowledge and assumptions made by the person defining the gating strategy. Populations not previously described in the literature or aberrant marker expression patterns may not be readily recognized.3
Computational Flow Cytometry Data Analysis
More recently, computational approaches such as dimensionality reduction and clustering techniques have become popular. Algorithms such as viSNE and FlowSOM (Figure 2 shows examples of tSNE and FlowSOM) are unbiased and allow the discovery of previously unrecognized similarities between cell populations.2,7 Clustering algorithms have been shown to match the results of manual gating and provide greater reproducibility.1 They may still be time-consuming to run, but generally do not require the same amount of hands-on time as manual analysis. However, the commercially available solutions that can be run without programming skills do not support the inclusion of expert knowledge that might aid in checking the plausibility of the results.6
Figure 2. Computational Analysis Using the Kaluza R Plugin. Computational analysis of DURAClone IM B Cell Antibody Panel (Part Number B53328) using the Kaluza R Console plugin. Scripts provided with the R Console plugin example files were adapted to run on CD19+ gated events. A) A tSNE analysis was performed including all fluorescence channels except CD45. Sequential gating was performed in parallel to allow for color backgating on the tSNE clusters. B) Star plots of CD19+ events after FlowSOM analysis.
Tree Plot Visualization of Flow Cytometry Data
The Tree Plot offered by Kaluza Analysis software combines analysis guided by expert knowledge with comprehensiveness and speed. It allows the assessment of physical characteristics of all events included in an analysis in a supervised manner, so unexpected marker combinations become readily visible. Tree Plots provide a useful data comparison tool, as one Tree Plot can condense data from up to 28 bivariate plots, thus reducing the time required for data interpretation once the analysis strategy has been set up. The Tree Plot classifies events into different classes/populations based on gating information. Events are assigned to a class depending on whether they are contained within or excluded from a given gate. The Tree Plot does not have any axes and thus no axis information. A maximum of eight gates can be used for classification on a Tree Plot. Since the number of classes is 2 to the power of the number of gates, a single Tree Plot with eight gates can display 256 populations. Tree Plots are especially effective when cell populations are characterized by markers showing a bimodal or binary expression pattern.
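The classification described above can be sketched as follows. This is an illustrative model under stated assumptions, not Kaluza's code: each event is represented by the hypothetical set of gate names that contain it, and the function names classify and class_label are invented for this example.

```python
from collections import Counter
from itertools import product
from math import comb


def classify(event_gates, branch_gates):
    """Return one in/out flag per branch gate for a single event."""
    return tuple(gate in event_gates for gate in branch_gates)


def class_label(flags, branch_gates):
    """Build a label such as 'CD27+ CD38-' from the in/out flags."""
    return " ".join(f"{g}{'+' if f else '-'}" for g, f in zip(branch_gates, flags))


branch_gates = ["CD27", "CD38"]

# Hypothetical events: each is the set of gates that contain the event.
events = [{"CD27", "CD38"}, {"CD27"}, {"CD38"}, set(), {"CD27"}, {"CD38", "IgM/IgD"}]

# Every combination of inside/outside across the branch gates is one class.
all_classes = list(product([True, False], repeat=len(branch_gates)))
counts = Counter(classify(e, branch_gates) for e in events)

for flags in all_classes:
    print(class_label(flags, branch_gates), counts[flags])

print(2 ** 8)      # classes available with eight gates
print(comb(8, 2))  # pairwise plots covered by eight gates
```

Because every event falls into exactly one class, no event inside the input gate is dropped from the overview, including those with unexpected marker combinations.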
Figure 3 shows the components of a Tree Plot including:
- Branches, which are used to categorize cell populations based on whether they have a negative or positive result for a specified phenotypic data type. Branches are located at the top of the plot.
- Bars, which allow for the comparison of possible negative/positive branch combinations between two cell populations. Bars are the central focus of the Tree Plot, as they are the pictorial representation of this phenotypic classification system. Bars can be viewed as either Count or % Gated.
Figure 3. Tree Plot Display Comparing Co-expression Patterns. Tree Plot display of IgM/IgD- CD19+ cells (blue bars) and IgM/IgD+ CD19+ cells (green bars) comparing co-expression patterns of CD27 and CD38 as branches of the plot. Data was generated using DURAClone IM B Cell Antibody Panel (Part Number B53328).
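The relationship between Branches, Bars and the two display modes can be sketched as below. The events are invented, and the denominator used for % Gated (all events of the respective bar population within the input gate) is an assumption made for illustration; the exact definition used by Kaluza is not stated in this paper.

```python
from collections import Counter

BRANCHES = ["CD27", "CD38"]

# Hypothetical events inside the CD19+ input gate:
# (bar population, set of branch gates the event is positive for)
events = [
    ("IgM/IgD+", {"CD27"}),
    ("IgM/IgD+", set()),
    ("IgM/IgD+", set()),
    ("IgM/IgD+", {"CD38"}),
    ("IgM/IgD-", {"CD27", "CD38"}),
    ("IgM/IgD-", {"CD27"}),
]


def branch_class(positive_gates):
    """Label the branch combination, e.g. 'CD27+ CD38-'."""
    return " ".join(f"{b}{'+' if b in positive_gates else '-'}" for b in BRANCHES)


# Count: number of events per (bar population, branch combination).
counts = Counter((bar, branch_class(gates)) for bar, gates in events)
bar_totals = Counter(bar for bar, _ in events)


def percent_gated(bar, cls):
    """Assumed definition: share of this bar population found in the class."""
    return 100.0 * counts[(bar, cls)] / bar_totals[bar]


for (bar, cls), n in sorted(counts.items()):
    print(f"{bar:9s} {cls:12s} count={n} percent={percent_gated(bar, cls):.1f}")
```

A relative measure of this kind lets populations of different sizes be compared side by side, whereas raw counts are dominated by the larger population.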
Step-by-Step Creation of Kaluza Tree Plot
1. Create plots displaying every phenotype you wish to include in your Tree Plot as Branch or Bar, see Figure 4.
2. On each plot, create a gate that includes the events that are positive for the respective phenotype. Kaluza will automatically label bars and branches that include the phenotype with + and those that exclude the phenotype with -. Therefore, it is best to label the gate with the population name only, omitting + or – for CD markers (Figure 4).
Figure 4. Identification of Populations and Phenotypes with the DURAClone IM B Cell Antibody Panel. Cells were stained using the DURAClone IM B Cell Antibody Panel (Part Number B53328) following the IFU. A) Sequential gating to identify CD19+ lymphocytes. The CD19+ population will be used as input gate. B) CD19+ lymphocytes are further divided based on their IgM and IgD expression levels. A Boolean gate was created to allow exclusion of IgM- IgD+ and IgM+ IgD- cells from the Tree Plot. The IgM/IgD gate will be used to define the bars of the Tree Plot. C) Histogram plots were created to visualize CD38 and CD27 expression. To facilitate identification of the positivity cut-off, Lymphocytes were chosen as input gate. The “Divider” gate type was used to mark the cut-off between positive and negative events. The positive side of the Divider was labelled with the name of the marker displayed.
3. Select the Tree Plot icon from the Plots & Tables ribbon tab to add a new Tree Plot.
4. Choose the input gate (if needed) to filter your data by selecting the [Ungated] hyperlink located at the top of the plot and choosing the gate from the pop-up menu.
5. Use the <Choose Branches> hyperlink to choose the Branches of the tree; Branches can be any gate within the Data Set. Each Branch added to the tree further classifies gated events based on whether they are positive or negative for the phenotypic characteristic defined in the Branch(es) of greater precedence. A legend for the Bars, including the colors and the definition of the positive/negative phenotypic data classification specifically associated with each bar, is, by default, located at the bottom of the plot.
6. Select the appropriate Y-axis data type for viewing the bars. The default measurement type is Count. If you wish to change the measurement type to % Gated, select the <Count> hyperlink, and from the pop-up list, choose % Gated.
Figure 5. Tree Plot Setup for the DURAClone IM B Cell Antibody Panel. A) A new Tree Plot was created and CD19+ chosen as input gate. B) CD27 was chosen as the first branch. Divider gates are found in the Quadrant category of the gate list. C) CD38 was chosen as the second branch. The Tree Plot now contains bars representing all 4 possible combinations of CD27 and CD38 expression. D) IgM/IgD was added as bars. E) The y-axis scale was switched to % Gated and different colors assigned to the bars by accessing the Tree Plot Radial menu and selecting the Coloring option.
7. For further analysis of any population visualized as a bar in a Tree Plot, these events can automatically be extracted by the generation of a Boolean gate. You may use Bars from Tree Plots as an input gate for other plots, including other Tree Plots. To gate a plot using a Bar, press the (Alt) key and select and drag the Bar onto the appropriate plot and release your mouse button to complete the process, see Figure 6.
Figure 6. Boolean Gate Creation from Tree Plot. Human peripheral blood was stained using the DURAClone IM T Cell Antibody Panel (Part Number B53328) according to IFU and acquired on a CytoFLEX LX flow cytometer. Data was analyzed in Kaluza Analysis. A) The CD45RA+ CCR7+ CD4+ bar (aqua) was selected while holding down the Alt key and dragged onto a CD27 vs. CD28 dot plot (step 1), which then changes the display to show only the relevant events (step 2). B) A Boolean gate representing the chosen bar was automatically created, as illustrated in the Boolean Gates dialogue box.
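Conceptually, a Bar corresponds to a conjunction of gate memberships, which is why it can be turned into a Boolean gate. The sketch below illustrates that idea only: the expression syntax, the function names, and the assignment of CD4 as the bar gate with CD45RA and CCR7 as branches are assumptions for this example, not Kaluza's actual output format.

```python
def bar_to_boolean_expression(bar_gate, branch_flags):
    """Build a readable AND/NOT expression for one bar of a Tree Plot.

    branch_flags maps each branch gate name to True (positive) or False (negative).
    """
    terms = [bar_gate]
    for gate, positive in branch_flags.items():
        terms.append(gate if positive else f"(NOT {gate})")
    return " AND ".join(terms)


def event_matches(event_gates, bar_gate, branch_flags):
    """Return True if an event (set of gates containing it) belongs to the bar."""
    if bar_gate not in event_gates:
        return False
    return all((gate in event_gates) == positive for gate, positive in branch_flags.items())


# Hypothetical reading of the bar shown in Figure 6.
flags = {"CD45RA": True, "CCR7": True}
expression = bar_to_boolean_expression("CD4", flags)
print(expression)

# Hypothetical events, each given as the set of gates that contain it.
events = [{"CD4", "CD45RA", "CCR7"}, {"CD4", "CCR7"}, {"CD45RA", "CCR7"}]
selected = [e for e in events if event_matches(e, "CD4", flags)]
print(len(selected))
```

Only the first event satisfies all three memberships, so it is the only one that would be shown on a plot gated on this bar.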
Tip 1: As you hover your mouse over a Bar, the names of the gates in the Branches associated with that Bar are displayed, including their positive or negative classification.
Tip 2: Tree Plots may also be used to compare two samples by merging two data files and separating them by gating as shown in Figure 7.
Figure 7. Tree Plot Comparing Control and Test Sample. Control sample (blue) and test sample (green) were merged in Kaluza as shown by the time plot. CD19+ events were used as input population and CD27 and CCR7 as branches. The CCR7 and CD27 histograms illustrate the gate placement. The Tree Plot compares % Gated on the control sample (blue) and the test sample (green) as bars. Data was generated by Thomas Liechti, Huldrych Günthard and Alexandra Trkola in Cytometry Part A.8
Summary
While assumption-driven manual data analysis may not be ideal for data exploration and discovery of novel populations, it is still relevant today for defined research questions and the analysis of known cellular subsets. With the patented Tree Plot, Kaluza offers a tool for fast and comprehensive deconvolution of multicolor data, rapid generation of statistical results and comparison between samples. Tree Plots are especially valuable for the analysis of markers that show a bimodal expression pattern.
References
1. Aghaeepour, N, Chattopadhyay, P, Chikina, M, et al. A Benchmark for Identification of Cellular Correlates of Clinical Outcomes. Cytometry Part A 2016:89(1). doi.org/10.1002/cyto.a.22732.
2. Amir, ED, Davis, KL, Tadmor, MD, et al. ViSNE enables visualization of high dimensional single-cell data and reveals phenotypic heterogeneity of leukemia. Nature Biotechnology 2013:31(6). doi.org/10.1038/nbt.2594.
3. Irish, JM. Beyond the age of cellular discovery. Nature Immunology 2014:15(12). doi.org/10.1038/ni.3034.
4. Lugli, E, Roederer, M, Cossarizza, A. Data analysis in flow cytometry: The future just started. Cytometry Part A 2010:77A(7). doi.org/10.1002/cyto.a.20901.
5. Maecker, HT, Rinfret, A, D’Souza, P, et al. Standardization of cytokine flow cytometry assays. BMC Immunology 2005:6(1). doi.org/10.1186/1471-2172-6-13.
6. Reiter, M, Rota, P, Kleber, F, et al. Clustering of cell populations in flow cytometry data using a combination of Gaussian mixtures. Pattern Recognition 2016:60. doi.org/10.1016/j.patcog.2016.04.004.
7. Saeys, Y, Van Gassen, S, Lambrecht, BN. Computational flow cytometry: Helping to make sense of high-dimensional immunology data. Nature Reviews Immunology 2016:16(7). doi.org/10.1038/nri.2016.56.
8. Liechti, T, Günthard, HF, Trkola, A. OMIP-047: High-Dimensional phenotypic characterization of B cells. Cytometry Part A 2018:93. doi.org/10.1002/cyto.a.23488.
For Research Use Only. Not for use in diagnostic procedures.